Overview

Dataset statistics

Number of variables: 21
Number of observations: 322047
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 51.9 MiB
Average record size in memory: 169.0 B
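
As a quick sanity check, the average record size should equal the total memory footprint divided by the number of observations. The sketch below assumes 1 MiB = 1024² bytes and uses the rounded figures reported above, so the result is only approximate.

```python
# Figures copied from the "Dataset statistics" block of this report.
total_mib = 51.9      # reported total size in memory (already rounded)
n_rows = 322_047      # number of observations

BYTES_PER_MIB = 1024 ** 2  # assumption: MiB means 2**20 bytes

avg_record_bytes = total_mib * BYTES_PER_MIB / n_rows
print(f"{avg_record_bytes:.1f} B per record")
```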

Variable types

Numeric: 9
DateTime: 1
Categorical: 9
Text: 1
Boolean: 1

Alerts

city is highly overall correlated with state and 1 other field [High correlation]
cluster is highly overall correlated with state and 1 other field [High correlation]
day is highly overall correlated with locale_name [High correlation]
dcoilwtico is highly overall correlated with id and 1 other field [High correlation]
id is highly overall correlated with dcoilwtico and 1 other field [High correlation]
locale is highly overall correlated with locale_name [High correlation]
locale_name is highly overall correlated with day and 1 other field [High correlation]
onpromotion is highly overall correlated with sales [High correlation]
sales is highly overall correlated with onpromotion [High correlation]
state is highly overall correlated with city and 2 other fields [High correlation]
store_nbr is highly overall correlated with type_of_store [High correlation]
type_of_store is highly overall correlated with city and 3 other fields [High correlation]
year is highly overall correlated with dcoilwtico and 1 other field [High correlation]
transferred is highly imbalanced (74.6%) [Imbalance]
family is uniformly distributed [Uniform]
sales has 73481 (22.8%) zeros [Zeros]
onpromotion has 238596 (74.1%) zeros [Zeros]
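
The zero percentages in the alerts can be reproduced from the raw counts and the number of observations. This is a minimal sketch using only the figures quoted in this report; the dictionary layout is an illustrative choice, not part of the profiler's output.

```python
# Counts copied from the "Zeros" alerts; n_rows from "Dataset statistics".
n_rows = 322_047
zero_counts = {"sales": 73_481, "onpromotion": 238_596}

# Share of rows whose value is exactly zero, in percent.
zero_share = {name: 100 * count / n_rows for name, count in zero_counts.items()}

for name, pct in zero_share.items():
    print(f"{name}: {pct:.1f}% zeros")
```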

Reproduction

Analysis started: 2024-02-28 18:01:34.957119
Analysis finished: 2024-02-28 18:03:32.192889
Duration: 1 minute and 57.24 seconds
Software version: ydata-profiling v4.6.5
Download configuration: config.json
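
The reported duration follows directly from the two timestamps. A small sketch, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the "Reproduction" block.
started = datetime.strptime("2024-02-28 18:01:34.957119", FMT)
finished = datetime.strptime("2024-02-28 18:03:32.192889", FMT)

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```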

Variables

id
Real number (ℝ)

HIGH CORRELATION 

Distinct: 292545
Distinct (%): 90.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1682979.5
Minimum: 561
Maximum: 3000887
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.9 MiB

Quantile statistics

Minimum: 561
5th percentile: 326484.3
Q1: 1010616.5
Median: 1842406
Q3: 2209555.5
95th percentile: 2852916.7
Maximum: 3000887
Range: 3000326
Interquartile range (IQR): 1198939

Descriptive statistics

Standard deviation: 786249.26
Coefficient of variation (CV): 0.46717698
Kurtosis: -1.0200574
Mean: 1682979.5
Median Absolute Deviation (MAD): 641223
Skewness: -0.22757083
Sum: 5.4199849 × 10^11
Variance: 6.181879 × 10^11
Monotonicity: Not monotonic
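
Several of the figures above are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, and the coefficient of variation is the standard deviation divided by the mean. The sketch below recomputes them from the rounded values printed for id, so small rounding differences are expected.

```python
# Values copied from the id variable's statistics.
minimum, maximum = 561, 3_000_887
q1, q3 = 1_010_616.5, 2_209_555.5
mean, std = 1_682_979.5, 786_249.26

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50% of values
cv = std / mean                   # dispersion relative to the mean

print(value_range, iqr, round(cv, 6))
```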
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
961254 4
 
< 0.1%
961469 4
 
< 0.1%
961819 4
 
< 0.1%
961818 4
 
< 0.1%
961821 4
 
< 0.1%
961834 4
 
< 0.1%
961835 4
 
< 0.1%
961837 4
 
< 0.1%
961836 4
 
< 0.1%
961850 4
 
< 0.1%
Other values (292535) 322007
> 99.9%
ValueCountFrequency (%)
561 1
< 0.1%
562 1
< 0.1%
563 1
< 0.1%
564 1
< 0.1%
565 1
< 0.1%
566 1
< 0.1%
567 1
< 0.1%
568 1
< 0.1%
569 1
< 0.1%
570 1
< 0.1%
ValueCountFrequency (%)
3000887 1
< 0.1%
3000886 1
< 0.1%
3000885 1
< 0.1%
3000884 1
< 0.1%
3000883 1
< 0.1%
3000882 1
< 0.1%
3000881 1
< 0.1%
3000880 1
< 0.1%
3000879 1
< 0.1%
3000878 1
< 0.1%

date
Date

Distinct: 179
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB
Minimum: 2013-01-01 00:00:00
Maximum: 2017-08-15 00:00:00
Histogram with fixed size bins (bins=50)
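
Comparing the 179 distinct dates with the calendar span between the minimum and maximum shows how sparse the date column is. The sketch below counts the span inclusively; the "coverage" ratio is an illustrative measure, not something the profiler reports.

```python
from datetime import date

# Minimum and maximum copied from the date variable's summary.
first, last = date(2013, 1, 1), date(2017, 8, 15)
distinct_dates = 179

calendar_days = (last - first).days + 1   # inclusive of both endpoints
coverage = distinct_dates / calendar_days

print(f"{distinct_dates} of {calendar_days} calendar days ({coverage:.1%})")
```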

store_nbr
Real number (ℝ)

HIGH CORRELATION 

Distinct: 54
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.994672
Minimum: 1
Maximum: 54
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.9 MiB

Quantile statistics

Minimum: 1
5th percentile: 3
Q1: 13
Median: 27
Q3: 40
95th percentile: 51
Maximum: 54
Range: 53
Interquartile range (IQR): 27

Descriptive statistics

Standard deviation: 15.595174
Coefficient of variation (CV): 0.57771305
Kurtosis: -1.2291947
Mean: 26.994672
Median Absolute Deviation (MAD): 14
Skewness: 0.003300613
Sum: 8693553
Variance: 243.20946
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 6402
 
2.0%
10 6402
 
2.0%
5 6402
 
2.0%
33 6402
 
2.0%
17 6402
 
2.0%
41 6402
 
2.0%
16 6402
 
2.0%
15 6402
 
2.0%
2 6402
 
2.0%
11 6402
 
2.0%
Other values (44) 258027
80.1%
ValueCountFrequency (%)
1 6402
2.0%
2 6402
2.0%
3 6402
2.0%
4 6402
2.0%
5 6402
2.0%
6 6402
2.0%
7 6402
2.0%
8 6402
2.0%
9 6402
2.0%
10 6402
2.0%
ValueCountFrequency (%)
54 6402
2.0%
53 5148
1.6%
52 429
 
0.1%
51 6402
2.0%
50 6402
2.0%
49 6402
2.0%
48 6402
2.0%
47 6402
2.0%
46 6402
2.0%
45 6402
2.0%

family
Categorical

UNIFORM 

Distinct: 33
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB
AUTOMOTIVE
 
9759
MAGAZINES
 
9759
LIQUOR,WINE,BEER
 
9759
LINGERIE
 
9759
LAWN AND GARDEN
 
9759
Other values (28)
273252 

Length

Max length: 26
Median length: 16
Mean length: 10.757576
Min length: 4

Characters and Unicode

Total characters: 3464445
Distinct characters: 27
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
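
To illustrate the character categories counted here, Python's standard `unicodedata` module exposes the general category of each code point (`Lu` for Uppercase Letter, `Zs` for Space Separator, `Po` for Other Punctuation). This is only a sketch of the idea applied to two family names from this report; it is not claimed to be how the profiler computes its counts.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# Two family values taken from the sample rows of this report.
counts = category_counts("LIQUOR,WINE,BEER") + category_counts("LAWN AND GARDEN")
print(dict(counts))
```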

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: AUTOMOTIVE
2nd row: MAGAZINES
3rd row: LIQUOR,WINE,BEER
4th row: LINGERIE
5th row: LAWN AND GARDEN

Common Values

ValueCountFrequency (%)
AUTOMOTIVE 9759
 
3.0%
MAGAZINES 9759
 
3.0%
LIQUOR,WINE,BEER 9759
 
3.0%
LINGERIE 9759
 
3.0%
LAWN AND GARDEN 9759
 
3.0%
LADIESWEAR 9759
 
3.0%
HOME CARE 9759
 
3.0%
HOME APPLIANCES 9759
 
3.0%
HOME AND KITCHEN II 9759
 
3.0%
HOME AND KITCHEN I 9759
 
3.0%
Other values (23) 224457
69.7%
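
The Uniform alert for family is consistent with the counts above: with 33 categories and 322047 rows, a perfectly even split gives exactly the per-category count shown in the table. A minimal check, using only figures from this report:

```python
# Figures copied from this report.
n_rows = 322_047
n_categories = 33
observed_per_category = 9_759   # count listed for each family value

expected, remainder = divmod(n_rows, n_categories)
is_even_split = remainder == 0 and observed_per_category == expected

print(expected, remainder, is_even_split)
```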

Length

Histogram of lengths of the category
ValueCountFrequency (%)
and 48795
 
9.1%
home 39036
 
7.3%
care 29277
 
5.5%
kitchen 19518
 
3.6%
grocery 19518
 
3.6%
foods 19518
 
3.6%
supplies 19518
 
3.6%
ii 19518
 
3.6%
i 19518
 
3.6%
baby 9759
 
1.8%
Other values (30) 292770
54.5%

Most occurring characters

ValueCountFrequency (%)
E 468432
13.5%
A 312288
 
9.0%
R 263493
 
7.6%
O 253734
 
7.3%
I 234216
 
6.8%
214698
 
6.2%
N 185421
 
5.4%
S 175662
 
5.1%
D 156144
 
4.5%
L 146385
 
4.2%
Other values (17) 1053972
30.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3220470
93.0%
Space Separator 214698
 
6.2%
Other Punctuation 29277
 
0.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 468432
14.5%
A 312288
 
9.7%
R 263493
 
8.2%
O 253734
 
7.9%
I 234216
 
7.3%
N 185421
 
5.8%
S 175662
 
5.5%
D 156144
 
4.8%
L 146385
 
4.5%
C 146385
 
4.5%
Other values (14) 878310
27.3%
Other Punctuation
ValueCountFrequency (%)
, 19518
66.7%
/ 9759
33.3%
Space Separator
ValueCountFrequency (%)
214698
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3220470
93.0%
Common 243975
 
7.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 468432
14.5%
A 312288
 
9.7%
R 263493
 
8.2%
O 253734
 
7.9%
I 234216
 
7.3%
N 185421
 
5.8%
S 175662
 
5.5%
D 156144
 
4.8%
L 146385
 
4.5%
C 146385
 
4.5%
Other values (14) 878310
27.3%
Common
ValueCountFrequency (%)
214698
88.0%
, 19518
 
8.0%
/ 9759
 
4.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3464445
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 468432
13.5%
A 312288
 
9.0%
R 263493
 
7.6%
O 253734
 
7.3%
I 234216
 
6.8%
214698
 
6.2%
N 185421
 
5.4%
S 175662
 
5.1%
D 156144
 
4.5%
L 146385
 
4.2%
Other values (17) 1053972
30.4%

sales
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 56108
Distinct (%): 17.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 406.38345
Minimum: 0
Maximum: 124717
Zeros: 73481
Zeros (%): 22.8%
Negative: 0
Negative (%): 0.0%
Memory size: 4.9 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1
Median: 19
Q3: 241.2605
95th percentile: 2226.7978
Maximum: 124717
Range: 124717
Interquartile range (IQR): 240.2605

Descriptive statistics

Standard deviation: 1246.8812
Coefficient of variation (CV): 3.0682382
Kurtosis: 675.05566
Mean: 406.38345
Median Absolute Deviation (MAD): 19
Skewness: 13.376369
Sum: 1.3087457 × 10^8
Variance: 1554712.8
Monotonicity: Not monotonic
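
Two relationships in the sales statistics are worth checking by hand: the variance should be the square of the standard deviation, and the large gap between mean and median reflects the strong right skew reported above. The sketch below uses the rounded values from this report; the mean-to-median ratio is an illustrative indicator, not a profiler output.

```python
import math

# Values copied from the sales variable's statistics.
std, variance = 1246.8812, 1_554_712.8
mean, median = 406.38345, 19

variance_from_std = std ** 2       # should match the reported variance
mean_to_median = mean / median     # far above 1 for a right-skewed variable

print(round(variance_from_std, 1), round(mean_to_median, 1))
```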
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 73481
 
22.8%
1 14109
 
4.4%
2 10532
 
3.3%
3 8323
 
2.6%
4 7086
 
2.2%
5 6137
 
1.9%
6 5500
 
1.7%
7 4687
 
1.5%
8 4065
 
1.3%
9 3733
 
1.2%
Other values (56098) 184394
57.3%
ValueCountFrequency (%)
0 73481
22.8%
0.188 1
 
< 0.1%
0.294 1
 
< 0.1%
0.3 1
 
< 0.1%
0.33 1
 
< 0.1%
0.338 1
 
< 0.1%
0.396 1
 
< 0.1%
0.446 1
 
< 0.1%
0.47 1
 
< 0.1%
0.486 1
 
< 0.1%
ValueCountFrequency (%)
124717 1
< 0.1%
89576.36 1
< 0.1%
87438.516 2
< 0.1%
76090 1
< 0.1%
63434 1
< 0.1%
53874 2
< 0.1%
46271 1
< 0.1%
45361 1
< 0.1%
33274 1
< 0.1%
31851.158 1
< 0.1%

onpromotion
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 285
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.7271361
Minimum: 0
Maximum: 716
Zeros: 238596
Zeros (%): 74.1%
Negative: 0
Negative (%): 0.0%
Memory size: 4.9 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 1
95th percentile: 21
Maximum: 716
Range: 716
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 15.512095
Coefficient of variation (CV): 4.1619342
Kurtosis: 240.59292
Mean: 3.7271361
Median Absolute Deviation (MAD): 0
Skewness: 10.93287
Sum: 1200313
Variance: 240.62509
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 238596
74.1%
1 23351
 
7.3%
2 10068
 
3.1%
3 5900
 
1.8%
4 4381
 
1.4%
5 3387
 
1.1%
6 3048
 
0.9%
7 2557
 
0.8%
8 1975
 
0.6%
9 1753
 
0.5%
Other values (275) 27031
 
8.4%
ValueCountFrequency (%)
0 238596
74.1%
1 23351
 
7.3%
2 10068
 
3.1%
3 5900
 
1.8%
4 4381
 
1.4%
5 3387
 
1.1%
6 3048
 
0.9%
7 2557
 
0.8%
8 1975
 
0.6%
9 1753
 
0.5%
ValueCountFrequency (%)
716 1
< 0.1%
697 1
< 0.1%
672 1
< 0.1%
655 1
< 0.1%
646 1
< 0.1%
644 1
< 0.1%
642 1
< 0.1%
639 1
< 0.1%
633 1
< 0.1%
630 2
< 0.1%

city
Categorical

HIGH CORRELATION 

Distinct: 22
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB
Quito
112266 
Guayaquil
47784 
Santo Domingo
16104 
Cuenca
16005 
Ambato
12804 
Other values (17)
117084 

Length

Max length: 13
Median length: 10
Mean length: 6.8639205
Min length: 4

Characters and Unicode

Total characters: 2210505
Distinct characters: 34
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
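
The mean length and the total character count are linked: dividing total characters by the number of observations should reproduce the mean length. A quick check for city, using the figures from this report:

```python
# Figures copied from the city variable and the dataset overview.
total_characters = 2_210_505
n_rows = 322_047

mean_length = total_characters / n_rows
print(round(mean_length, 7))
```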

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Quito
2nd row: Quito
3rd row: Quito
4th row: Quito
5th row: Quito

Common Values

ValueCountFrequency (%)
Quito 112266
34.9%
Guayaquil 47784
14.8%
Santo Domingo 16104
 
5.0%
Cuenca 16005
 
5.0%
Ambato 12804
 
4.0%
Machala 12804
 
4.0%
Latacunga 12606
 
3.9%
Quevedo 6402
 
2.0%
Esmeraldas 6402
 
2.0%
Playas 6402
 
2.0%
Other values (12) 72468
22.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
quito 112266
32.6%
guayaquil 47784
13.9%
santo 16104
 
4.7%
domingo 16104
 
4.7%
cuenca 16005
 
4.6%
ambato 12804
 
3.7%
machala 12804
 
3.7%
latacunga 12606
 
3.7%
cayambe 6402
 
1.9%
ibarra 6402
 
1.9%
Other values (14) 85272
24.7%

Most occurring characters

ValueCountFrequency (%)
a 362010
16.4%
u 258753
11.7%
o 208428
 
9.4%
i 195129
 
8.8%
t 165594
 
7.5%
Q 118668
 
5.4%
l 92598
 
4.2%
n 85602
 
3.9%
y 70092
 
3.2%
e 60654
 
2.7%
Other values (24) 592977
26.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1843446
83.4%
Uppercase Letter 344553
 
15.6%
Space Separator 22506
 
1.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 362010
19.6%
u 258753
14.0%
o 208428
11.3%
i 195129
10.6%
t 165594
9.0%
l 92598
 
5.0%
n 85602
 
4.6%
y 70092
 
3.8%
e 60654
 
3.3%
m 54450
 
3.0%
Other values (10) 290136
15.7%
Uppercase Letter
ValueCountFrequency (%)
Q 118668
34.4%
G 54186
15.7%
C 28809
 
8.4%
L 25245
 
7.3%
S 22506
 
6.5%
D 22506
 
6.5%
M 18381
 
5.3%
E 12804
 
3.7%
A 12804
 
3.7%
P 9504
 
2.8%
Other values (3) 19140
 
5.6%
Space Separator
ValueCountFrequency (%)
22506
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2187999
99.0%
Common 22506
 
1.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 362010
16.5%
u 258753
11.8%
o 208428
 
9.5%
i 195129
 
8.9%
t 165594
 
7.6%
Q 118668
 
5.4%
l 92598
 
4.2%
n 85602
 
3.9%
y 70092
 
3.2%
e 60654
 
2.8%
Other values (23) 570471
26.1%
Common
ValueCountFrequency (%)
22506
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2210505
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 362010
16.4%
u 258753
11.7%
o 208428
 
9.4%
i 195129
 
8.8%
t 165594
 
7.5%
Q 118668
 
5.4%
l 92598
 
4.2%
n 85602
 
3.9%
y 70092
 
3.2%
e 60654
 
2.7%
Other values (24) 592977
26.8%

state
Categorical

HIGH CORRELATION 

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB
Pichincha
118668 
Guayas
66825 
Santo Domingo de los Tsachilas
16104 
Azuay
16005 
El Oro
12804 
Other values (11)
91641 

Length

Max length: 30
Median length: 11
Mean length: 8.8598217
Min length: 4

Characters and Unicode

Total characters: 2853279
Distinct characters: 37
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Pichincha
2nd row: Pichincha
3rd row: Pichincha
4th row: Pichincha
5th row: Pichincha

Common Values

ValueCountFrequency (%)
Pichincha 118668
36.8%
Guayas 66825
20.8%
Santo Domingo de los Tsachilas 16104
 
5.0%
Azuay 16005
 
5.0%
El Oro 12804
 
4.0%
Tungurahua 12804
 
4.0%
Los Rios 12804
 
4.0%
Cotopaxi 12606
 
3.9%
Manabi 11979
 
3.7%
Esmeraldas 6402
 
2.0%
Other values (6) 35046
 
10.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
pichincha 118668
28.4%
guayas 66825
16.0%
los 28908
 
6.9%
santo 16104
 
3.8%
domingo 16104
 
3.8%
de 16104
 
3.8%
tsachilas 16104
 
3.8%
azuay 16005
 
3.8%
tungurahua 12804
 
3.1%
rios 12804
 
3.1%
Other values (12) 98043
23.4%

Most occurring characters

ValueCountFrequency (%)
a 452067
15.8%
i 319671
11.2%
h 272580
 
9.6%
c 253440
 
8.9%
n 188463
 
6.6%
s 156651
 
5.5%
o 153516
 
5.4%
u 127644
 
4.5%
P 121770
 
4.3%
96426
 
3.4%
Other values (27) 711051
24.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2370588
83.1%
Uppercase Letter 386265
 
13.5%
Space Separator 96426
 
3.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 452067
19.1%
i 319671
13.5%
h 272580
11.5%
c 253440
10.7%
n 188463
8.0%
s 156651
 
6.6%
o 153516
 
6.5%
u 127644
 
5.4%
y 82830
 
3.5%
l 64218
 
2.7%
Other values (12) 299508
12.6%
Uppercase Letter
ValueCountFrequency (%)
P 121770
31.5%
G 66825
17.3%
T 28908
 
7.5%
E 25608
 
6.6%
S 22506
 
5.8%
L 19206
 
5.0%
C 18942
 
4.9%
D 16104
 
4.2%
A 16005
 
4.1%
R 12804
 
3.3%
Other values (4) 37587
 
9.7%
Space Separator
ValueCountFrequency (%)
96426
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2756853
96.6%
Common 96426
 
3.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 452067
16.4%
i 319671
11.6%
h 272580
9.9%
c 253440
9.2%
n 188463
 
6.8%
s 156651
 
5.7%
o 153516
 
5.6%
u 127644
 
4.6%
P 121770
 
4.4%
y 82830
 
3.0%
Other values (26) 628221
22.8%
Common
ValueCountFrequency (%)
96426
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2853279
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 452067
15.8%
i 319671
11.2%
h 272580
 
9.6%
c 253440
 
8.9%
n 188463
 
6.6%
s 156651
 
5.5%
o 153516
 
5.4%
u 127644
 
4.5%
P 121770
 
4.3%
96426
 
3.4%
Other values (27) 711051
24.9%

type_of_store
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB
D
110121 
C
92367 
A
51645 
B
45144 
E
22770 

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 322047
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: D
2nd row: D
3rd row: D
4th row: D
5th row: D

Common Values

ValueCountFrequency (%)
D 110121
34.2%
C 92367
28.7%
A 51645
16.0%
B 45144
14.0%
E 22770
 
7.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
d 110121
34.2%
c 92367
28.7%
a 51645
16.0%
b 45144
14.0%
e 22770
 
7.1%

Most occurring characters

ValueCountFrequency (%)
D 110121
34.2%
C 92367
28.7%
A 51645
16.0%
B 45144
14.0%
E 22770
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 322047
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
D 110121
34.2%
C 92367
28.7%
A 51645
16.0%
B 45144
14.0%
E 22770
 
7.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 322047
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
D 110121
34.2%
C 92367
28.7%
A 51645
16.0%
B 45144
14.0%
E 22770
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 322047
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
D 110121
34.2%
C 92367
28.7%
A 51645
16.0%
B 45144
14.0%
E 22770
 
7.1%

cluster
Real number (ℝ)

HIGH CORRELATION 

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.531202
Minimum: 1
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.9 MiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 4
Median: 9
Q3: 13
95th percentile: 15
Maximum: 17
Range: 16
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 4.7138089
Coefficient of variation (CV): 0.55253748
Kurtosis: -1.2955902
Mean: 8.531202
Median Absolute Deviation (MAD): 5
Skewness: 0.033899842
Sum: 2747448
Variance: 22.219994
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)
ValueCountFrequency (%)
3 44715
13.9%
10 35574
11.0%
6 32736
10.2%
15 31812
9.9%
14 25608
8.0%
13 24354
7.6%
4 19206
 
6.0%
8 19206
 
6.0%
1 18546
 
5.8%
11 13233
 
4.1%
Other values (7) 57057
17.7%
ValueCountFrequency (%)
1 18546
5.8%
2 9603
 
3.0%
3 44715
13.9%
4 19206
6.0%
5 6402
 
2.0%
6 32736
10.2%
7 9438
 
2.9%
8 19206
6.0%
9 12804
 
4.0%
10 35574
11.0%
ValueCountFrequency (%)
17 6402
 
2.0%
16 6006
 
1.9%
15 31812
9.9%
14 25608
8.0%
13 24354
7.6%
12 6402
 
2.0%
11 13233
 
4.1%
10 35574
11.0%
9 12804
 
4.0%
8 19206
6.0%

transactions
Real number (ℝ)

Distinct: 3097
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1734.1178
Minimum: 54
Maximum: 8359
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.9 MiB

Quantile statistics

Minimum: 54
5th percentile: 642
Q1: 1030
Median: 1409
Q3: 2148
95th percentile: 3788
Maximum: 8359
Range: 8305
Interquartile range (IQR): 1118

Descriptive statistics

Standard deviation: 1050.335
Coefficient of variation (CV): 0.60568838
Kurtosis: 4.8491261
Mean: 1734.1178
Median Absolute Deviation (MAD): 483
Skewness: 1.8467475
Sum: 5.5846745 × 10^8
Variance: 1103203.7
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1304 561
 
0.2%
1248 495
 
0.2%
1296 495
 
0.2%
1147 462
 
0.1%
1451 462
 
0.1%
1392 462
 
0.1%
1370 462
 
0.1%
875 462
 
0.1%
984 429
 
0.1%
1283 429
 
0.1%
Other values (3087) 317328
98.5%
ValueCountFrequency (%)
54 33
< 0.1%
353 33
< 0.1%
357 33
< 0.1%
381 33
< 0.1%
383 33
< 0.1%
386 33
< 0.1%
396 33
< 0.1%
400 66
< 0.1%
401 33
< 0.1%
406 33
< 0.1%
ValueCountFrequency (%)
8359 33
< 0.1%
8307 33
< 0.1%
8256 33
< 0.1%
8120 33
< 0.1%
8001 33
< 0.1%
7840 33
< 0.1%
7727 33
< 0.1%
7700 33
< 0.1%
7689 33
< 0.1%
7597 33
< 0.1%

type_of_holiday
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB
Holiday
202818 
Event
59235 
Additional
41415 
Transfer
 
13695
Bridge
 
4884

Length

Max length: 10
Median length: 7
Mean length: 7.0452915
Min length: 5

Characters and Unicode

Total characters: 2268915
Distinct characters: 19
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Holiday
2nd row: Holiday
3rd row: Holiday
4th row: Holiday
5th row: Holiday

Common Values

ValueCountFrequency (%)
Holiday 202818
63.0%
Event 59235
 
18.4%
Additional 41415
 
12.9%
Transfer 13695
 
4.3%
Bridge 4884
 
1.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
holiday 202818
63.0%
event 59235
 
18.4%
additional 41415
 
12.9%
transfer 13695
 
4.3%
bridge 4884
 
1.5%

Most occurring characters

ValueCountFrequency (%)
i 290532
12.8%
d 290532
12.8%
a 257928
11.4%
o 244233
10.8%
l 244233
10.8%
H 202818
8.9%
y 202818
8.9%
n 114345
 
5.0%
t 100650
 
4.4%
e 77814
 
3.4%
Other values (9) 243012
10.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1946868
85.8%
Uppercase Letter 322047
 
14.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 290532
14.9%
d 290532
14.9%
a 257928
13.2%
o 244233
12.5%
l 244233
12.5%
y 202818
10.4%
n 114345
 
5.9%
t 100650
 
5.2%
e 77814
 
4.0%
v 59235
 
3.0%
Other values (4) 64548
 
3.3%
Uppercase Letter
ValueCountFrequency (%)
H 202818
63.0%
E 59235
 
18.4%
A 41415
 
12.9%
T 13695
 
4.3%
B 4884
 
1.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 2268915
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 290532
12.8%
d 290532
12.8%
a 257928
11.4%
o 244233
10.8%
l 244233
10.8%
H 202818
8.9%
y 202818
8.9%
n 114345
 
5.0%
t 100650
 
4.4%
e 77814
 
3.4%
Other values (9) 243012
10.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2268915
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 290532
12.8%
d 290532
12.8%
a 257928
11.4%
o 244233
10.8%
l 244233
10.8%
H 202818
8.9%
y 202818
8.9%
n 114345
 
5.0%
t 100650
 
4.4%
e 77814
 
3.4%
Other values (9) 243012
10.7%

locale
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB
National
160710 
Local
140415 
Regional
20922 

Length

Max length: 8
Median length: 8
Mean length: 6.6919766
Min length: 5

Characters and Unicode

Total characters: 2155131
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: National
2nd row: National
3rd row: National
4th row: National
5th row: National

Common Values

ValueCountFrequency (%)
National 160710
49.9%
Local 140415
43.6%
Regional 20922
 
6.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
national 160710
49.9%
local 140415
43.6%
regional 20922
 
6.5%

Most occurring characters

ValueCountFrequency (%)
a 482757
22.4%
o 322047
14.9%
l 322047
14.9%
i 181632
 
8.4%
n 181632
 
8.4%
N 160710
 
7.5%
t 160710
 
7.5%
L 140415
 
6.5%
c 140415
 
6.5%
R 20922
 
1.0%
Other values (2) 41844
 
1.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1833084
85.1%
Uppercase Letter 322047
 
14.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 482757
26.3%
o 322047
17.6%
l 322047
17.6%
i 181632
 
9.9%
n 181632
 
9.9%
t 160710
 
8.8%
c 140415
 
7.7%
e 20922
 
1.1%
g 20922
 
1.1%
Uppercase Letter
ValueCountFrequency (%)
N 160710
49.9%
L 140415
43.6%
R 20922
 
6.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 2155131
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 482757
22.4%
o 322047
14.9%
l 322047
14.9%
i 181632
 
8.4%
n 181632
 
8.4%
N 160710
 
7.5%
t 160710
 
7.5%
L 140415
 
6.5%
c 140415
 
6.5%
R 20922
 
1.0%
Other values (2) 41844
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2155131
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 482757
22.4%
o 322047
14.9%
l 322047
14.9%
i 181632
 
8.4%
n 181632
 
8.4%
N 160710
 
7.5%
t 160710
 
7.5%
L 140415
 
6.5%
c 140415
 
6.5%
R 20922
 
1.0%
Other values (2) 41844
 
1.9%

locale_name
Categorical

HIGH CORRELATION 

Distinct: 24
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB
Ecuador
160710 
Riobamba
 
13266
Guayaquil
 
13200
Guaranda
 
11781
Latacunga
 
11352
Other values (19)
111738 

Length

Max length: 30
Median length: 7
Mean length: 7.6685111
Min length: 4

Characters and Unicode

Total characters: 2469621
Distinct characters: 36
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Ecuador
2nd row: Ecuador
3rd row: Ecuador
4th row: Ecuador
5th row: Ecuador

Common Values

ValueCountFrequency (%)
Ecuador 160710
49.9%
Riobamba 13266
 
4.1%
Guayaquil 13200
 
4.1%
Guaranda 11781
 
3.7%
Latacunga 11352
 
3.5%
Ambato 8283
 
2.6%
Quito 8184
 
2.5%
Cuenca 6765
 
2.1%
Puyo 6666
 
2.1%
Libertad 6633
 
2.1%
Other values (14) 75207
23.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
ecuador 160710
44.7%
riobamba 13266
 
3.7%
guayaquil 13200
 
3.7%
guaranda 11781
 
3.3%
domingo 11418
 
3.2%
santo 11418
 
3.2%
latacunga 11352
 
3.2%
ambato 8283
 
2.3%
quito 8184
 
2.3%
cuenca 6765
 
1.9%
Other values (19) 103125
28.7%

Most occurring characters

ValueCountFrequency (%)
a 431970
17.5%
o 260733
10.6%
u 243144
9.8%
r 203742
 
8.2%
d 197076
 
8.0%
c 188463
 
7.6%
E 178629
 
7.2%
n 79167
 
3.2%
i 69102
 
2.8%
t 62271
 
2.5%
Other values (26) 555324
22.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2082432
84.3%
Uppercase Letter 349734
 
14.2%
Space Separator 37455
 
1.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 431970
20.7%
o 260733
12.5%
u 243144
11.7%
r 203742
9.8%
d 197076
9.5%
c 188463
9.1%
n 79167
 
3.8%
i 69102
 
3.3%
t 62271
 
3.0%
b 59070
 
2.8%
Other values (12) 287694
13.8%
Uppercase Letter
ValueCountFrequency (%)
E 178629
51.1%
G 24981
 
7.1%
C 24453
 
7.0%
L 23067
 
6.6%
S 21351
 
6.1%
Q 14718
 
4.2%
R 13266
 
3.8%
D 11418
 
3.3%
M 9867
 
2.8%
A 8283
 
2.4%
Other values (3) 19701
 
5.6%
Space Separator
ValueCountFrequency (%)
37455
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2432166
98.5%
Common 37455
 
1.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 431970
17.8%
o 260733
10.7%
u 243144
10.0%
r 203742
 
8.4%
d 197076
 
8.1%
c 188463
 
7.7%
E 178629
 
7.3%
n 79167
 
3.3%
i 69102
 
2.8%
t 62271
 
2.6%
Other values (25) 517869
21.3%
Common
ValueCountFrequency (%)
37455
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2469621
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 431970
17.5%
o 260733
10.6%
u 243144
9.8%
r 203742
 
8.2%
d 197076
 
8.0%
c 188463
 
7.6%
E 178629
 
7.2%
n 79167
 
3.2%
i 69102
 
2.8%
t 62271
 
2.5%
Other values (26) 555324
22.5%

Distinct: 80
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB

Length

Max length: 42
Median length: 34
Mean length: 21.204837
Min length: 8

Characters and Unicode

Total characters: 6828954
Distinct characters: 58
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Carnaval
2nd row: Carnaval
3rd row: Carnaval
4th row: Carnaval
5th row: Carnaval
ValueCountFrequency (%)
de 202488
22.2%
fundacion 69564
 
7.6%
cantonizacion 54318
 
6.0%
independencia 38346
 
4.2%
terremoto 36729
 
4.0%
del 21450
 
2.4%
provincializacion 20922
 
2.3%
dia 19965
 
2.2%
santo 19569
 
2.1%
primer 17094
 
1.9%
Other values (78) 410850
45.1%

Most occurring characters

ValueCountFrequency (%)
a 887568
13.0%
n 671022
 
9.8%
589248
 
8.6%
i 555951
 
8.1%
e 521004
 
7.6%
d 502260
 
7.4%
o 465762
 
6.8%
c 263571
 
3.9%
r 255057
 
3.7%
u 217272
 
3.2%
Other values (48) 1900239
27.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5382069
78.8%
Uppercase Letter 661353
 
9.7%
Space Separator 589248
 
8.6%
Decimal Number 104280
 
1.5%
Math Symbol 41613
 
0.6%
Dash Punctuation 39534
 
0.6%
Other Punctuation 10857
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 887568
16.5%
n 671022
12.5%
i 555951
10.3%
e 521004
9.7%
d 502260
9.3%
o 465762
8.7%
c 263571
 
4.9%
r 255057
 
4.7%
u 217272
 
4.0%
t 205953
 
3.8%
Other values (15) 836649
15.5%
Uppercase Letter
ValueCountFrequency (%)
C 106788
16.1%
F 80817
12.2%
M 64053
9.7%
P 58179
8.8%
T 55110
8.3%
I 48048
7.3%
G 41745
 
6.3%
S 32604
 
4.9%
D 28215
 
4.3%
N 26565
 
4.0%
Other values (9) 119229
18.0%
Decimal Number
ValueCountFrequency (%)
1 41844
40.1%
2 20625
19.8%
3 12078
 
11.6%
4 6996
 
6.7%
6 5247
 
5.0%
0 5247
 
5.0%
7 3498
 
3.4%
5 3498
 
3.4%
9 3498
 
3.4%
8 1749
 
1.7%
Space Separator
ValueCountFrequency (%)
589248
100.0%
Math Symbol
ValueCountFrequency (%)
+ 41613
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 39534
100.0%
Other Punctuation
ValueCountFrequency (%)
: 10857
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6043422
88.5%
Common 785532
 
11.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 887568
14.7%
n 671022
11.1%
i 555951
 
9.2%
e 521004
 
8.6%
d 502260
 
8.3%
o 465762
 
7.7%
c 263571
 
4.4%
r 255057
 
4.2%
u 217272
 
3.6%
t 205953
 
3.4%
Other values (34) 1498002
24.8%
Common
ValueCountFrequency (%)
589248
75.0%
1 41844
 
5.3%
+ 41613
 
5.3%
- 39534
 
5.0%
2 20625
 
2.6%
3 12078
 
1.5%
: 10857
 
1.4%
4 6996
 
0.9%
6 5247
 
0.7%
0 5247
 
0.7%
Other values (4) 12243
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6828954
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 887568
13.0%
n 671022
 
9.8%
589248
 
8.6%
i 555951
 
8.1%
e 521004
 
7.6%
d 502260
 
7.4%
o 465762
 
6.8%
c 263571
 
3.9%
r 255057
 
3.7%
u 217272
 
3.2%
Other values (48) 1900239
27.8%

transferred
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 MiB
False
308352 
True
 
13695
ValueCountFrequency (%)
False 308352
95.7%
True 13695
 
4.3%
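
The IMBALANCE flag on transferred can be sanity-checked from the two counts above. The sketch below assumes the imbalance score is one minus the Shannon entropy of the class frequencies, normalized by the logarithm of the number of classes. That formula is an assumption made here, not something this report states, and `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy of the class counts.

    0 means perfectly balanced classes, values near 1 mean one class dominates.
    Assumed formula, not taken from the report itself.
    """
    if len(counts) < 2:
        return 1.0  # a single class is treated as maximally imbalanced
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))


# Counts copied from the transferred frequency table (False, True).
score = imbalance_score([308352, 13695])
print(f"{score:.1%}")
```

Under this assumption the score comes out at roughly 74.6%, in line with the imbalance figure quoted for transferred in the report's alerts.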

dcoilwtico
Real number (ℝ)

HIGH CORRELATION 

Distinct: 171
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 63.686222
Minimum: 27.96
Maximum: 107.95
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.9 MiB

Quantile statistics

Minimum: 27.96
5-th percentile: 37.46
Q1: 44.88
median: 51.98
Q3: 94.09
95-th percentile: 106.07
Maximum: 107.95
Range: 79.99
Interquartile range (IQR): 49.21

Descriptive statistics

Standard deviation: 24.842082
Coefficient of variation (CV): 0.39006996
Kurtosis: -1.1593673
Mean: 63.686222
Median Absolute Deviation (MAD): 9.03
Skewness: 0.71614784
Sum: 20509957
Variance: 617.12903
Monotonicity: Not monotonic
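
Several of the dcoilwtico figures follow from the others by simple arithmetic. As a rough cross-check, the sketch below recomputes the range, IQR, coefficient of variation, variance, and sum from the reported minimum, maximum, quartiles, mean, and standard deviation. The variable names are illustrative only, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the dcoilwtico summary in this report.
n = 322047                        # number of observations
minimum, maximum = 27.96, 107.95
q1, q3 = 44.88, 94.09
mean, std = 63.686222, 24.842082

value_range = maximum - minimum   # reported: 79.99
iqr = q3 - q1                     # reported: 49.21
cv = std / mean                   # reported: 0.39006996
variance = std ** 2               # reported: 617.12903
total = mean * n                  # reported: 20509957

for name, value in [("range", value_range), ("IQR", iqr), ("CV", cv),
                    ("variance", variance), ("sum", total)]:
    print(f"{name}: {value:.5f}")
```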
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
107.04 | 6204 | 1.9%
53.19 | 5247 | 1.6%
59.59 | 4950 | 1.5%
104.76 | 4653 | 1.4%
95.25 | 4653 | 1.4%
46.02 | 3564 | 1.1%
46.21 | 3531 | 1.1%
36.12 | 3498 | 1.1%
43.18 | 3498 | 1.1%
51.98 | 3498 | 1.1%
Other values (161) | 278751 | 86.6%

Minimum 10 values
Value | Count | Frequency (%)
27.96 | 1749 | 0.5%
29.71 | 1749 | 0.5%
34.55 | 1749 | 0.5%
34.57 | 1749 | 0.5%
35.36 | 1749 | 0.5%
36.12 | 3498 | 1.1%
36.76 | 1749 | 0.5%
37.13 | 1749 | 0.5%
37.46 | 1749 | 0.5%
37.62 | 1749 | 0.5%

Maximum 10 values
Value | Count | Frequency (%)
107.95 | 1551 | 0.5%
107.43 | 1551 | 0.5%
107.2 | 1551 | 0.5%
107.13 | 1518 | 0.5%
107.04 | 6204 | 1.9%
106.83 | 1551 | 0.5%
106.61 | 1551 | 0.5%
106.07 | 1551 | 0.5%
106.06 | 1551 | 0.5%
105.47 | 1518 | 0.5%

year
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB

Value | Count
2016 | 99198
2014 | 73524
2015 | 65901
2013 | 46266
2017 | 37158

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1288188
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2013
2nd row: 2013
3rd row: 2013
4th row: 2013
5th row: 2013

Common Values

Value | Count | Frequency (%)
2016 | 99198 | 30.8%
2014 | 73524 | 22.8%
2015 | 65901 | 20.5%
2013 | 46266 | 14.4%
2017 | 37158 | 11.5%
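
The Frequency (%) column is each count divided by the total number of observations. A minimal sketch, using only the year counts listed above (the variable names are illustrative):

```python
# Counts copied from the year "Common Values" table.
year_counts = {"2016": 99198, "2014": 73524, "2015": 65901,
               "2013": 46266, "2017": 37158}

# The counts should add up to the 322047 observations in the dataset.
n_rows = sum(year_counts.values())

# Share of rows per year, rounded to one decimal place as in the report.
frequency_pct = {year: round(100 * count / n_rows, 1)
                 for year, count in year_counts.items()}

print(n_rows)
print(frequency_pct)
```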

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2016 | 99198 | 30.8%
2014 | 73524 | 22.8%
2015 | 65901 | 20.5%
2013 | 46266 | 14.4%
2017 | 37158 | 11.5%

Most occurring characters

Value | Count | Frequency (%)
2 | 322047 | 25.0%
0 | 322047 | 25.0%
1 | 322047 | 25.0%
6 | 99198 | 7.7%
4 | 73524 | 5.7%
5 | 65901 | 5.1%
3 | 46266 | 3.6%
7 | 37158 | 2.9%
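
The character counts follow directly from the year counts: every value has the form 201x, so the characters 2, 0 and 1 each occur once per row, while the last digit occurs once per row of that year. The sketch below reproduces the table from the year counts alone (the variable names are illustrative):

```python
from collections import Counter

# Counts copied from the year "Common Values" table.
year_counts = {"2016": 99198, "2014": 73524, "2015": 65901,
               "2013": 46266, "2017": 37158}

# Weight each character of the year string by the number of rows with that year.
char_counts = Counter()
for year, count in year_counts.items():
    for ch in year:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
for ch, count in char_counts.most_common():
    print(ch, count, f"{count / total_chars:.1%}")
```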

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1288188 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 322047 | 25.0%
0 | 322047 | 25.0%
1 | 322047 | 25.0%
6 | 99198 | 7.7%
4 | 73524 | 5.7%
5 | 65901 | 5.1%
3 | 46266 | 3.6%
7 | 37158 | 2.9%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1288188 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 322047 | 25.0%
0 | 322047 | 25.0%
1 | 322047 | 25.0%
6 | 99198 | 7.7%
4 | 73524 | 5.7%
5 | 65901 | 5.1%
3 | 46266 | 3.6%
7 | 37158 | 2.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1288188 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 322047 | 25.0%
0 | 322047 | 25.0%
1 | 322047 | 25.0%
6 | 99198 | 7.7%
4 | 73524 | 5.7%
5 | 65901 | 5.1%
3 | 46266 | 3.6%
7 | 37158 | 2.9%

month
Real number (ℝ)

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.3894866
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.9 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 5
median: 7
Q3: 11
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.2425734
Coefficient of variation (CV): 0.43880902
Kurtosis: -1.3056153
Mean: 7.3894866
Median Absolute Deviation (MAD): 3
Skewness: 0.092296768
Sum: 2379762
Variance: 10.514282
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Common values
Value | Count | Frequency (%)
12 | 51381 | 16.0%
4 | 50391 | 15.6%
11 | 45012 | 14.0%
5 | 40953 | 12.7%
7 | 37191 | 11.5%
6 | 27192 | 8.4%
8 | 25278 | 7.8%
10 | 14553 | 4.5%
2 | 13266 | 4.1%
3 | 9966 | 3.1%
Other values (2) | 6864 | 2.1%

Minimum 10 values
Value | Count | Frequency (%)
1 | 3465 | 1.1%
2 | 13266 | 4.1%
3 | 9966 | 3.1%
4 | 50391 | 15.6%
5 | 40953 | 12.7%
6 | 27192 | 8.4%
7 | 37191 | 11.5%
8 | 25278 | 7.8%
9 | 3399 | 1.1%
10 | 14553 | 4.5%

Maximum 10 values
Value | Count | Frequency (%)
12 | 51381 | 16.0%
11 | 45012 | 14.0%
10 | 14553 | 4.5%
9 | 3399 | 1.1%
8 | 25278 | 7.8%
7 | 37191 | 11.5%
6 | 27192 | 8.4%
5 | 40953 | 12.7%
4 | 50391 | 15.6%
3 | 9966 | 3.1%

day
Real number (ℝ)

HIGH CORRELATION 

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.711548
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.9 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 6
median: 12
Q3: 24
95-th percentile: 28
Maximum: 31
Range: 30
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 9.1952195
Coefficient of variation (CV): 0.62503411
Kurtosis: -1.4367679
Mean: 14.711548
Median Absolute Deviation (MAD): 9
Skewness: 0.051555478
Sum: 4737810
Variance: 84.552062
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
Common values
Value | Count | Frequency (%)
25 | 27654 | 8.6%
12 | 23133 | 7.2%
3 | 23034 | 7.2%
24 | 19932 | 6.2%
23 | 18084 | 5.6%
1 | 16137 | 5.0%
2 | 13662 | 4.2%
10 | 13596 | 4.2%
11 | 13200 | 4.1%
5 | 13167 | 4.1%
Other values (21) | 140448 | 43.6%

Minimum 10 values
Value | Count | Frequency (%)
1 | 16137 | 5.0%
2 | 13662 | 4.2%
3 | 23034 | 7.2%
4 | 6567 | 2.0%
5 | 13167 | 4.1%
6 | 9933 | 3.1%
7 | 11385 | 3.5%
8 | 8382 | 2.6%
9 | 9933 | 3.1%
10 | 13596 | 4.2%

Maximum 10 values
Value | Count | Frequency (%)
31 | 4884 | 1.5%
30 | 3300 | 1.0%
29 | 3267 | 1.0%
28 | 10197 | 3.2%
27 | 6996 | 2.2%
26 | 9999 | 3.1%
25 | 27654 | 8.6%
24 | 19932 | 6.2%
23 | 18084 | 5.6%
22 | 11913 | 3.7%

day_of_week
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB

Value | Count
4 | 75207
0 | 74580
1 | 61050
3 | 58476
2 | 52734

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 322047
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
4 | 75207 | 23.4%
0 | 74580 | 23.2%
1 | 61050 | 19.0%
3 | 58476 | 18.2%
2 | 52734 | 16.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
4 | 75207 | 23.4%
0 | 74580 | 23.2%
1 | 61050 | 19.0%
3 | 58476 | 18.2%
2 | 52734 | 16.4%

Most occurring characters

Value | Count | Frequency (%)
4 | 75207 | 23.4%
0 | 74580 | 23.2%
1 | 61050 | 19.0%
3 | 58476 | 18.2%
2 | 52734 | 16.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 322047 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
4 | 75207 | 23.4%
0 | 74580 | 23.2%
1 | 61050 | 19.0%
3 | 58476 | 18.2%
2 | 52734 | 16.4%

Most occurring scripts

Value | Count | Frequency (%)
Common | 322047 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
4 | 75207 | 23.4%
0 | 74580 | 23.2%
1 | 61050 | 19.0%
3 | 58476 | 18.2%
2 | 52734 | 16.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 322047 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
4 | 75207 | 23.4%
0 | 74580 | 23.2%
1 | 61050 | 19.0%
3 | 58476 | 18.2%
2 | 52734 | 16.4%

Interactions

[Figure: 81 pairwise interaction plots]

Correlations

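variable | city | cluster | day | day_of_week | dcoilwtico | family | id | locale | locale_name | month | onpromotion | sales | state | store_nbr | transactions | transferred | type_of_holiday | type_of_store | year
city | 1.000 | 0.080 | -0.001 | 0.001 | -0.001 | 0.000 | 0.000 | 0.016 | 0.005 | -0.004 | 0.017 | 0.047 | 1.000 | -0.355 | 0.198 | 0.011 | 0.013 | 0.642 | 0.060
cluster | 0.080 | 1.000 | 0.001 | 0.000 | 0.022 | 0.000 | -0.020 | 0.009 | 0.004 | 0.001 | -0.004 | 0.025 | 0.515 | -0.071 | 0.219 | 0.005 | 0.007 | 0.755 | 0.039
day | -0.001 | 0.001 | 1.000 | 0.167 | -0.032 | 0.000 | 0.055 | 0.401 | 0.593 | 0.056 | 0.019 | -0.000 | 0.007 | 0.001 | 0.034 | 0.316 | 0.314 | 0.000 | 0.198
day_of_week | 0.001 | 0.000 | 0.167 | 1.000 | -0.009 | 0.000 | 0.100 | 0.118 | 0.232 | 0.023 | 0.099 | 0.019 | 0.000 | 0.002 | -0.025 | 0.212 | 0.183 | 0.000 | 0.104
dcoilwtico | -0.001 | 0.022 | -0.032 | -0.009 | 1.000 | 0.000 | -0.689 | 0.275 | 0.334 | 0.021 | -0.220 | -0.117 | 0.032 | -0.008 | 0.011 | 0.167 | 0.262 | 0.020 | 0.577
family | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | -0.091 | -0.058 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
id | 0.000 | -0.020 | 0.055 | 0.100 | -0.689 | 0.000 | 1.000 | 0.294 | 0.322 | -0.057 | 0.367 | 0.141 | 0.040 | 0.016 | -0.024 | 0.261 | 0.376 | 0.023 | 0.906
locale | 0.016 | 0.009 | 0.401 | 0.118 | 0.275 | 0.000 | 0.294 | 1.000 | 1.000 | -0.080 | 0.006 | 0.017 | 0.016 | -0.001 | 0.049 | 0.108 | 0.439 | 0.004 | 0.183
locale_name | 0.005 | 0.004 | 0.593 | 0.232 | 0.334 | 0.000 | 0.322 | 1.000 | 1.000 | 0.176 | -0.013 | -0.022 | 0.007 | 0.001 | -0.032 | 0.239 | 0.378 | 0.000 | 0.266
month | -0.004 | 0.001 | 0.056 | 0.023 | 0.021 | 0.000 | -0.057 | -0.080 | 0.176 | 1.000 | -0.011 | 0.034 | 0.014 | 0.001 | 0.127 | 0.281 | 0.351 | 0.007 | 0.281
onpromotion | 0.017 | -0.004 | 0.019 | 0.099 | -0.220 | -0.091 | 0.367 | 0.006 | -0.013 | -0.011 | 1.000 | 0.581 | 0.022 | 0.035 | 0.056 | 0.024 | 0.014 | 0.024 | 0.042
sales | 0.047 | 0.025 | -0.000 | 0.019 | -0.117 | -0.058 | 0.141 | 0.017 | -0.022 | 0.034 | 0.581 | 1.000 | 0.012 | 0.030 | 0.182 | 0.002 | 0.008 | 0.028 | 0.006
state | 1.000 | 0.515 | 0.007 | 0.000 | 0.032 | 0.000 | 0.040 | 0.016 | 0.007 | 0.014 | 0.022 | 0.012 | 1.000 | -0.235 | 0.318 | 0.011 | 0.013 | 0.519 | 0.054
store_nbr | -0.355 | -0.071 | 0.001 | 0.002 | -0.008 | 0.000 | 0.016 | -0.001 | 0.001 | 0.001 | 0.035 | 0.030 | -0.235 | 1.000 | 0.054 | 0.007 | 0.009 | 0.670 | 0.043
transactions | 0.198 | 0.219 | 0.034 | -0.025 | 0.011 | 0.000 | -0.024 | 0.049 | -0.032 | 0.127 | 0.056 | 0.182 | 0.318 | 0.054 | 1.000 | 0.048 | 0.143 | 0.415 | 0.046
transferred | 0.011 | 0.005 | 0.316 | 0.212 | 0.167 | 0.000 | 0.261 | 0.108 | 0.239 | 0.281 | 0.024 | 0.002 | 0.011 | 0.007 | 0.048 | 1.000 | 0.162 | 0.000 | 0.203
type_of_holiday | 0.013 | 0.007 | 0.314 | 0.183 | 0.262 | 0.000 | 0.376 | 0.439 | 0.378 | 0.351 | 0.014 | 0.008 | 0.013 | 0.009 | 0.143 | 0.162 | 1.000 | 0.003 | 0.258
type_of_store | 0.642 | 0.755 | 0.000 | 0.000 | 0.020 | 0.000 | 0.023 | 0.004 | 0.000 | 0.007 | 0.024 | 0.028 | 0.519 | 0.670 | 0.415 | 0.000 | 0.003 | 1.000 | 0.022
year | 0.060 | 0.039 | 0.198 | 0.104 | 0.577 | 0.000 | 0.906 | 0.183 | 0.266 | 0.281 | 0.042 | 0.006 | 0.054 | 0.043 | 0.046 | 0.203 | 0.258 | 0.022 | 1.000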
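
To pick out the strongly related pairs from the correlation matrix, one option is to filter on the absolute value of the coefficient. The sketch below does this for a handful of entries copied from the matrix. The 0.5 cut-off is an assumption chosen for illustration; this section does not state the threshold behind the HIGH CORRELATION flags, and `pairs` and `THRESHOLD` are hypothetical names.

```python
# A few coefficients copied from the correlation matrix (not the full table).
pairs = {
    ("city", "state"): 1.000,
    ("cluster", "type_of_store"): 0.755,
    ("dcoilwtico", "id"): -0.689,
    ("id", "year"): 0.906,
    ("onpromotion", "sales"): 0.581,
    ("city", "store_nbr"): -0.355,
    ("sales", "transactions"): 0.182,
}

THRESHOLD = 0.5  # assumed cut-off, not taken from the report

# Keep pairs whose absolute coefficient reaches the threshold, strongest first.
high = sorted(
    (pair for pair, r in pairs.items() if abs(r) >= THRESHOLD),
    key=lambda pair: -abs(pairs[pair]),
)

for a, b in high:
    print(f"{a} ~ {b}: {pairs[(a, b)]:+.3f}")
```

Filtering on the absolute value matters here because dcoilwtico and id are strongly but negatively related (-0.689), which a plain `>=` comparison would miss.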

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

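First rows
(index) | id | date | store_nbr | family | sales | onpromotion | city | state | type_of_store | cluster | transactions | type_of_holiday | locale | locale_name | description | transferred | dcoilwtico | year | month | day | day_of_week
0 | 73062.0 | 2013-02-11 | 1.0 | AUTOMOTIVE | 0.0 | 0.0 | Quito | Pichincha | D | 13 | 396 | Holiday | National | Ecuador | Carnaval | False | 97.01 | 2013 | 2 | 11 | 0
1 | 73085.0 | 2013-02-11 | 1.0 | MAGAZINES | 0.0 | 0.0 | Quito | Pichincha | D | 13 | 396 | Holiday | National | Ecuador | Carnaval | False | 97.01 | 2013 | 2 | 11 | 0
2 | 73084.0 | 2013-02-11 | 1.0 | LIQUOR,WINE,BEER | 21.0 | 0.0 | Quito | Pichincha | D | 13 | 396 | Holiday | National | Ecuador | Carnaval | False | 97.01 | 2013 | 2 | 11 | 0
3 | 73083.0 | 2013-02-11 | 1.0 | LINGERIE | 0.0 | 0.0 | Quito | Pichincha | D | 13 | 396 | Holiday | National | Ecuador | Carnaval | False | 97.01 | 2013 | 2 | 11 | 0
4 | 73082.0 | 2013-02-11 | 1.0 | LAWN AND GARDEN | 3.0 | 0.0 | Quito | Pichincha | D | 13 | 396 | Holiday | National | Ecuador | Carnaval | False | 97.01 | 2013 | 2 | 11 | 0
5 | 73081.0 | 2013-02-11 | 1.0 | LADIESWEAR | 0.0 | 0.0 | Quito | Pichincha | D | 13 | 396 | Holiday | National | Ecuador | Carnaval | False | 97.01 | 2013 | 2 | 11 | 0
6 | 73080.0 | 2013-02-11 | 1.0 | HOME CARE | 0.0 | 0.0 | Quito | Pichincha | D | 13 | 396 | Holiday | National | Ecuador | Carnaval | False | 97.01 | 2013 | 2 | 11 | 0
7 | 73079.0 | 2013-02-11 | 1.0 | HOME APPLIANCES | 0.0 | 0.0 | Quito | Pichincha | D | 13 | 396 | Holiday | National | Ecuador | Carnaval | False | 97.01 | 2013 | 2 | 11 | 0
8 | 73078.0 | 2013-02-11 | 1.0 | HOME AND KITCHEN II | 0.0 | 0.0 | Quito | Pichincha | D | 13 | 396 | Holiday | National | Ecuador | Carnaval | False | 97.01 | 2013 | 2 | 11 | 0
9 | 73077.0 | 2013-02-11 | 1.0 | HOME AND KITCHEN I | 0.0 | 0.0 | Quito | Pichincha | D | 13 | 396 | Holiday | National | Ecuador | Carnaval | False | 97.01 | 2013 | 2 | 11 | 0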

Last rows
(index) | id | date | store_nbr | family | sales | onpromotion | city | state | type_of_store | cluster | transactions | type_of_holiday | locale | locale_name | description | transferred | dcoilwtico | year | month | day | day_of_week
322037 | 1297880.0 | 2015-01-01 | 25.0 | MAGAZINES | 0.00000 | 0.0 | Salinas | Santa Elena | D | 1 | 2202 | Holiday | National | Ecuador | Primer dia del ano | False | 53.45 | 2015 | 1 | 1 | 3
322038 | 1297882.0 | 2015-01-01 | 25.0 | PERSONAL CARE | 203.00000 | 0.0 | Salinas | Santa Elena | D | 1 | 2202 | Holiday | National | Ecuador | Primer dia del ano | False | 53.45 | 2015 | 1 | 1 | 3
322039 | 1297881.0 | 2015-01-01 | 25.0 | MEATS | 479.76700 | 2.0 | Salinas | Santa Elena | D | 1 | 2202 | Holiday | National | Ecuador | Primer dia del ano | False | 53.45 | 2015 | 1 | 1 | 3
322040 | 1297883.0 | 2015-01-01 | 25.0 | PET SUPPLIES | 0.00000 | 0.0 | Salinas | Santa Elena | D | 1 | 2202 | Holiday | National | Ecuador | Primer dia del ano | False | 53.45 | 2015 | 1 | 1 | 3
322041 | 1297889.0 | 2015-01-01 | 25.0 | SEAFOOD | 14.00000 | 0.0 | Salinas | Santa Elena | D | 1 | 2202 | Holiday | National | Ecuador | Primer dia del ano | False | 53.45 | 2015 | 1 | 1 | 3
322042 | 1297888.0 | 2015-01-01 | 25.0 | SCHOOL AND OFFICE SUPPLIES | 0.00000 | 0.0 | Salinas | Santa Elena | D | 1 | 2202 | Holiday | National | Ecuador | Primer dia del ano | False | 53.45 | 2015 | 1 | 1 | 3
322043 | 1297887.0 | 2015-01-01 | 25.0 | PRODUCE | 105.00000 | 0.0 | Salinas | Santa Elena | D | 1 | 2202 | Holiday | National | Ecuador | Primer dia del ano | False | 53.45 | 2015 | 1 | 1 | 3
322044 | 1297886.0 | 2015-01-01 | 25.0 | PREPARED FOODS | 121.94100 | 0.0 | Salinas | Santa Elena | D | 1 | 2202 | Holiday | National | Ecuador | Primer dia del ano | False | 53.45 | 2015 | 1 | 1 | 3
322045 | 1297885.0 | 2015-01-01 | 25.0 | POULTRY | 279.16998 | 0.0 | Salinas | Santa Elena | D | 1 | 2202 | Holiday | National | Ecuador | Primer dia del ano | False | 53.45 | 2015 | 1 | 1 | 3
322046 | 1297884.0 | 2015-01-01 | 25.0 | PLAYERS AND ELECTRONICS | 0.00000 | 0.0 | Salinas | Santa Elena | D | 1 | 2202 | Holiday | National | Ecuador | Primer dia del ano | False | 53.45 | 2015 | 1 | 1 | 3
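
In the sample rows, the year, month, day and day_of_week columns look as if they were derived from the date column. The sketch below shows one way such a derivation could work, assuming the Monday = 0 weekday convention of Python's `date.weekday()`. Both that convention and the helper name `date_parts` are assumptions, since the report does not describe how the columns were built.

```python
from datetime import date


def date_parts(iso_date):
    """Split an ISO date string into the calendar columns seen in the sample."""
    d = date.fromisoformat(iso_date)
    return {
        "year": d.year,
        "month": d.month,
        "day": d.day,
        "day_of_week": d.weekday(),  # Monday = 0 ... Sunday = 6
    }


# Dates taken from the first and last sample rows.
print(date_parts("2013-02-11"))
print(date_parts("2015-01-01"))
```

Under that convention, 2013-02-11 (a Monday) maps to day_of_week 0 and 2015-01-01 (a Thursday) maps to 3, which agrees with the sample rows.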